CLAIMS

1. A method of determining language usage probabilities of a natural 
language based upon a training corpus, the method comprising: 

examining a training corpus, wherein such corpus includes phrases parsed 
in accordance with a set of grammar rules; 

computing probabilities of usage of combinations of linguistic features 
based upon empirical tracking of appearances of instances of such combinations in 
phrases within the training corpus. 

2. A method as recited in claim 1, wherein the combinations of 
linguistic features comprise: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); and 

• (transition, headword). 

3. A method as recited in claim 1, wherein the combinations of 
linguistic features consist of: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); or 

• (transition, headword). 

4. A method as recited in claim 1, wherein the computing comprises 
counting appearances of instances of combinations of linguistic features within the 
training corpus. 
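
The counting step of claims 1 and 4 can be illustrated with a minimal sketch. It assumes each node observed in the parsed training corpus has already been reduced to a (transition, headword) pair. The function name `estimate_probabilities`, the record layout, and the relative-frequency estimate (count of the pair divided by count of its headword) are illustrative assumptions, not formulas taken from the claims.

```python
from collections import Counter

def estimate_probabilities(records):
    """Estimate P(transition | headword) by relative frequency.

    `records` is an iterable of (transition, headword) pairs, one per node
    observed in the parsed training corpus (a hypothetical layout).
    """
    pair_counts = Counter()      # appearances of (transition, headword)
    context_counts = Counter()   # appearances of the headword alone
    for transition, headword in records:
        pair_counts[(transition, headword)] += 1
        context_counts[headword] += 1
    return {
        pair: count / context_counts[pair[1]]
        for pair, count in pair_counts.items()
    }

# Toy corpus with made-up rule names: three nodes headed by "eat", one by "dog".
corpus = [
    ("VP->V NP", "eat"),
    ("VP->V NP", "eat"),
    ("VP->V", "eat"),
    ("NP->Det N", "dog"),
]
probs = estimate_probabilities(corpus)
```

The same pattern would extend to the longer feature combinations by using a longer tuple as the counter key.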

5. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform the method as recited in 
claim 1. 

6. A method for determining a probability at a node in a parse tree, the 
method comprising: 

receiving language-usage probabilities based upon appearances of instances 
of combinations of linguistic features within a training corpus; 

calculating the probability at the node based upon linguistic features of the 
node and the language-usage probabilities. 

7. A method as recited in claim 6, wherein the combinations of 
linguistic features comprise: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); and 

• (transition, headword). 

8. A method as recited in claim 6, wherein the combinations of 
linguistic features consist of: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); or 

• (transition, headword). 

9. A method as recited in claim 6, wherein the calculating comprises 
using PredParamRule Probability formula to calculate the probability at the node. 

10. A method as recited in claim 6, wherein the calculating comprises 
using both PredParamRule Probability and SynBigram Probability formulas to 
calculate the probability at the node. 

11. A method for determining a statistical goodness measure (SGM) of 
a parse tree representing a parse of a phrase, the parse tree comprising one or more 
nodes, the method comprising calculating a statistical product of probabilities of 
each node in the parse tree, wherein the probabilities of each node are determined 
by the method as recited in claim 6. 
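
As a rough illustration of claim 11, the sketch below multiplies per-node probabilities over a whole tree. The `Node` class and the `sgm` function are hypothetical names, and each node's probability is supplied directly rather than computed by the method of claim 6. Summing logarithms instead of multiplying raw values is an implementation choice made here to reduce the risk of floating-point underflow on large trees.

```python
import math
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    prob: float                       # probability already assigned to this node
    children: List["Node"] = field(default_factory=list)

def sgm(root: Node) -> float:
    """Return the product of the probabilities of every node in the tree."""
    log_total = 0.0
    stack = [root]
    while stack:
        node = stack.pop()
        log_total += math.log(node.prob)   # assumes prob > 0
        stack.extend(node.children)
    return math.exp(log_total)

# Four-node toy tree: the score is 0.5 * 0.4 * 0.25 * 0.5.
tree = Node(0.5, [Node(0.4), Node(0.25, [Node(0.5)])])
score = sgm(tree)
```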

12. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform the method as recited in 
claim 6. 

13. A method for determining a statistical goodness measure (SGM) of 
a parse tree representing a parse of a phrase, the parse tree comprising one or more 
nodes, the method comprising: 

combining probabilities of each node in the parse tree, wherein the 
probabilities of each node are determined by the steps comprising: 

receiving language-usage probabilities based upon appearances of 
instances of combinations of linguistic features within a training corpus; 

calculating the probabilities of each node based upon linguistic 
features of each node and the language-usage probabilities. 

14. A method as recited in claim 13, wherein the combinations of 
linguistic features comprise: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); and 

• (transition, headword). 

15. A method as recited in claim 13, wherein the calculating comprises 
using PredParamRule Probability formula to calculate the probability at the node. 

16. A method as recited in claim 13, wherein the calculating comprises 
using both PredParamRule Probability and SynBigram Probability formulas to 
calculate the probability at the node. 

17. A method as recited in claim 13, wherein during the combining, the 
probabilities of each node in the parse tree are combined in a top-down, generative 
approach. 

18. A method for determining statistical goodness measures (SGMs) of 
multiple parse trees, each tree representing a syntactically valid parse of a phrase, 
the method comprising determining a SGM of each parse tree by the method as 
recited in claim 13. 

19. A method for ranking multiple parse trees, each tree representing a 
syntactically valid parse of a phrase, the method comprising: 

determining statistical goodness measures (SGMs) of each parse tree by the 
method as recited in claim 13 to get an SGM value associated with each tree; 
organizing the trees in order of each tree's associated SGM value. 
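
The ranking in claim 19 amounts to a sort on the SGM values. The following is a minimal sketch, assuming the SGM of each candidate parse has already been computed and that a higher value indicates a more probable parse (the claim itself only says the trees are organized in order of SGM value). The function `rank_parses` and the string labels are hypothetical.

```python
from typing import List, Tuple

def rank_parses(scored: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Order (parse_label, sgm_value) pairs from highest to lowest SGM."""
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Made-up SGM values for three candidate parses of one phrase.
candidates = [("parse_a", 0.0021), ("parse_b", 0.0175), ("parse_c", 0.0004)]
ranked = rank_parses(candidates)
best_label = ranked[0][0]
```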

20. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform the method as recited in 
claim 13. 

21. A method of parsing a phrase to facilitate processing of such phrase 
by a computer, the method comprising: 

generating at least one parse tree representing a syntactically valid parse of 
the phrase, wherein the parse tree has hierarchical nodes; 

dividing each node into one or more hierarchical phrase levels, wherein the 
phrase levels at a node represent a set of possible transitions from such node that 
are allowed by a set of grammar rules. 

22. A method as recited in claim 21, wherein the set of possible 
transitions from each node consists of all possible transitions from such node that 
are allowed by a set of grammar rules. 

23. A method as recited in claim 21, wherein the set of possible 
transitions from each node includes a null transition representing an application of 
none of the grammar rules. 
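
One way to picture claims 21 to 23 is a node that carries an ordered list of phrase levels, each tagged with a transition that a grammar rule allows at that level, plus a distinguished null transition meaning that no rule is applied. This is a simplified sketch: the class names, the `NULL_TRANSITION` sentinel, and the example rule names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

NULL_TRANSITION = "<null>"   # hypothetical sentinel: none of the grammar rules applied

@dataclass
class PhraseLevel:
    level: int          # position in the node's hierarchy (0 = first level added)
    transition: str     # grammar rule applied at this level, or NULL_TRANSITION

@dataclass
class ParseNode:
    headword: str
    levels: List[PhraseLevel] = field(default_factory=list)

    def add_level(self, transition: str = NULL_TRANSITION) -> PhraseLevel:
        """Append the next phrase level, defaulting to the null transition."""
        level = PhraseLevel(len(self.levels), transition)
        self.levels.append(level)
        return level

node = ParseNode("eat")
node.add_level("VerbWithObject")    # made-up rule name
node.add_level("VerbWithAdverb")    # made-up rule name
node.add_level()                    # null transition: no rule applied
```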

24. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform the method as recited in 
claim 21. 

25. A method of parsing a phrase to facilitate processing of such phrase 
by a computer, the method comprising: 

generating at least one parse tree representing a syntactically valid parse of 
the phrase, wherein the parse tree has hierarchical nodes; 
calculating a syntactic history for each node. 

26. A method as recited in claim 25 further comprising storing the 
syntactic history for each node. 

27. A method as recited in claim 25, wherein the syntactic history may 
indicate one or more of the following syntactic phenomena: 



• passive verb phrase; 

• negative polarity; 

• domodal fronting; 

• comparative; 

• imperative; 

• topicalization of verb object. 
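
Since claim 27 lets a syntactic history indicate one or more phenomena at once, a bit-flag representation is one natural fit. The sketch below uses Python's `enum.Flag`. The member names mirror the list above, while the rule that a child node inherits its parent's history and adds its own phenomena is an assumption for illustration only.

```python
from enum import Flag, auto

class SyntacticHistory(Flag):
    NONE = 0
    PASSIVE_VERB_PHRASE = auto()
    NEGATIVE_POLARITY = auto()
    DOMODAL_FRONTING = auto()
    COMPARATIVE = auto()
    IMPERATIVE = auto()
    TOPICALIZED_VERB_OBJECT = auto()

def child_history(parent: SyntacticHistory, own: SyntacticHistory) -> SyntacticHistory:
    """Combine inherited phenomena with those introduced at the child node."""
    return parent | own

root = SyntacticHistory.PASSIVE_VERB_PHRASE
child = child_history(root, SyntacticHistory.NEGATIVE_POLARITY)
```

Because the flags combine with `|`, a single stored value per node can record any subset of the listed phenomena.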



28. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform the method as recited in 
claim 25. 

29. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform a method to determine 
language usage probabilities of a natural language based upon a training corpus, 
the method comprising: 

examining a training corpus, wherein such corpus includes phrases parsed 
in accordance with a set of grammar rules; 

computing probabilities of usage of combinations of linguistic features 
based upon empirical tracking of appearances of instances of such combinations in 
phrases within the training corpus. 

30. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform a method to determine a 
probability at a node in a parse tree, the method comprising: 

receiving language-usage probabilities based upon appearances of instances 
of combinations of linguistic features within a training corpus; 

calculating the probability at the node based upon linguistic features of the 
node and the language-usage probabilities. 

31. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform a method to determine a 
statistical goodness measure (SGM) of a parse tree representing a parse of a 
phrase, the parse tree comprising one or more nodes, the method comprising: 

combining probabilities of each node in the parse tree, wherein the 
probabilities of each node are determined by the steps comprising: 

receiving language-usage probabilities based upon appearances of 
instances of combinations of linguistic features within a training corpus; 

calculating the probabilities of each node based upon linguistic 
features of each node and the language-usage probabilities. 

32. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform a method to rank multiple 
parse trees, each tree representing a syntactically valid parse of a phrase, the 
method comprising: 

generating at least one parse tree representing a syntactically valid parse of 
the phrase, wherein the parse tree has hierarchical nodes; 

dividing each node into one or more hierarchical phrase levels, wherein the 
phrase levels at a node represent a set of possible transitions from such node that 
are allowed by a set of grammar rules. 

33. A computer-readable storage medium having computer-executable 
instructions that, when executed by a computer, perform a method to parse a 
phrase, the method comprising: 

generating at least one parse tree representing a syntactically valid parse of 
the phrase, wherein the parse tree has hierarchical nodes; 
calculating a syntactic history for each node. 

34. An apparatus comprising: 
a processor; 

a natural-language-usage probability determiner executable on the 
processor to: 

examine a training corpus, wherein such corpus includes phrases 
parsed in accordance with a set of grammar rules; 

compute probabilities of usage of combinations of linguistic features 
based upon empirical tracking of appearances of instances of such 
combinations in phrases within the training corpus. 

35. An apparatus as recited in claim 34, wherein the combinations of 
linguistic features comprise: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); and 

• (transition, headword). 

36. An apparatus as recited in claim 34, wherein the combinations of 
linguistic features consist of: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); or 

• (transition, headword). 

37. An apparatus comprising: 
a processor; 

a natural-language-usage probability determiner executable on the 
processor to: 

receive language-usage probabilities based upon appearances of 
instances of combinations of linguistic features within a training corpus; 

calculate a probability at a node in a parse tree based upon linguistic 
features of the node and the language-usage probabilities. 



38. An apparatus as recited in claim 37, wherein the combinations of 
linguistic features comprise: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); and 

• (transition, headword). 

39. An apparatus as recited in claim 37, wherein the combinations of 
linguistic features consist of: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); or 

• (transition, headword). 

40. An apparatus as recited in claim 37, wherein the determiner 
calculates the probability at the node by using PredParamRule Probability 
formula. 

41. An apparatus as recited in claim 37, wherein the determiner 
calculates the probability at the node by using both PredParamRule Probability 
and SynBigram Probability formulas. 

42. An apparatus comprising: 
a processor; 

a natural-language-usage parser executable on the processor to: 

generate at least one parse tree representing a syntactically valid 
parse of the phrase, wherein the parse tree has hierarchical nodes; 

divide each node into one or more hierarchical phrase levels, 
wherein the phrase levels at a node represent a set of possible transitions 
from such node that are allowed by a set of grammar rules. 

43. An apparatus as recited in claim 42, wherein the set of possible 
transitions from each node includes a null transition representing an application of 
none of the grammar rules. 

44. An apparatus comprising: 
a processor; 

a natural-language-usage parser executable on the processor to: 

generate at least one parse tree representing a syntactically valid 
parse of the phrase, wherein the parse tree has hierarchical nodes; 

calculate a syntactic history for each node. 

45. An apparatus as recited in claim 44, wherein the syntactic history 
may indicate one or more of the following syntactic phenomena: 

• passive verb phrase; 

• negative polarity; 

• domodal fronting; 

• comparative; 

• imperative; 

• topicalization of verb object. 

46. A natural-language-usage probability determiner comprising: 
data-acquisition device for receiving language-usage probabilities based 
upon appearances of instances of combinations of linguistic features within a 
training corpus; 

probability calculator for calculating a probability at a node of a parse tree 
based upon linguistic features of the node and the language-usage probabilities. 

47. A data structure for use with a computer having a processor and a 
memory, said structure comprising: 

a corpus comprising one or more phrases in a natural language; 

parse trees having hierarchical nodes, each tree representing at least one 
syntactically valid parse of each phrase in a subset of the corpus; 

wherein each node has one or more hierarchical phrase levels, wherein the 
phrase levels at a node represent a set of possible transitions from such node that 
are allowed by a set of grammar rules. 

48. The structure as recited in claim 47, wherein the subset of the corpus 
includes all phrases in the corpus. 

49. A data structure for use with a computer having a processor and a 
memory, said structure comprising: 

a corpus comprising one or more phrases in a natural language; 
parse trees having hierarchical nodes, each tree representing at least one 
syntactically valid parse of each phrase in a subset of the corpus; 

wherein one or more nodes have a syntactic history associated therewith. 

50. The structure as recited in claim 49, wherein the subset of the corpus 
includes all phrases in the corpus. 

51. A data structure for use with a computer having a processor and a 
memory, said structure comprising: 

a corpus comprising one or more phrases in a natural language; 

parse trees having hierarchical nodes, each tree representing at least one 
syntactically valid parse of each phrase in a subset of the corpus; 

wherein each node has an associated probability, wherein the associated 
probability of a node is based upon linguistic features of such node and language- 
usage probabilities derived from appearances of instances of combinations of 
linguistic features within a training corpus. 

52. A method as recited in claim 51, wherein the combinations of 
linguistic features comprise: 

• (transition, headword, phrase level, syntactic history, segtype); 

• (headword, phrase level, syntactic history, segtype); 

• (modifying headword, transition, headword); and 

• (transition, headword). 

53. A method as recited in claim 51, wherein PredParamRule 
Probability formula is used to calculate a probability associated with a node. 

54. A method as recited in claim 51, wherein both PredParamRule 
Probability and SynBigram Probability formulas are used to calculate a probability 
associated with a node. 

55. The structure as recited in claim 51, wherein the subset of the corpus 
includes all phrases in the corpus. 

56. A program module for execution on a computing operating 
environment having a memory, the module comprising: 

a natural language phrase parser configured to generate one or more 
syntactically valid parses for a phrase, each parse may be represented by a parse 
tree having hierarchical nodes; 

a parse ranker configured to calculate a SGM for each parse of a phrase and 
to rank the parses to indicate a most probable parse; 

wherein the parse ranker comprises: 

data-acquisition device for receiving language-usage probabilities 
based upon appearances of instances of combinations of linguistic features 
within a training corpus; 

probability calculator for calculating a probability at a node of a 
parse tree based upon linguistic features of the node and the language-usage 
probabilities. 

57. A natural language processing system comprising a program module 
as recited in claim 56. 

58. A grammar checking system comprising a program module as 
recited in claim 56. 

59. A speech processing system comprising a program module as 
recited in claim 56. 

60. A database query processing system comprising a program module 
as recited in claim 56. 

61. An operating system comprising a program module as recited in 
claim 56. 